Recovery Schemes for Mesh Arrays Utilizing
Authors
Abstract
Error recovery capability is examined in processing arrays that employ spare nodes for fault tolerance. Spares can provide fault tolerance to high-performance single-package arrays, where it is not feasible to repair faulty subsystems. The cost of such a fault-tolerance solution, redundant hardware that idles until needed, may not be practical. Manufacturers must be offered hardware solutions to fault tolerance that provide useful work at all times. In this paper, new schemes are presented in which idling spares are utilized to improve error recovery. Without expedient error recovery, computation in environments experiencing frequent errors can be burdened with extra cost in terms of job completion time. Further, in such environments, a job may never be able to reach completion. Spares aid in the validation and selection of recovery points in systems experiencing randomly distributed errors. Successful job completion in environments of error bursts is achieved with the aid of a scheme [1] that identifies reliable data when periodic on-line testing is available. Spares help identify the boundaries of reliable data. We consider these features in mesh arrays that are used in digital signal processing applications. Preliminary simulations highlight the overhead of our schemes in terms of job completion times in environments burdened with transient errors.
In general, when a system fails, operation must stop, faulty components must be identified and reconfigured, and the current job must be restarted from the beginning. Error recovery is the ability to continue healthy operation without losing all computation results, that is, without having to return to the job start. In transient error environments, where frequent errors are expected due to power supply jitter, noise, and/or radiation [5], the cost of restarting a failed job from the beginning can have a significant impact on the time to complete the job. Further, the job may never be able to reach completion.
In this paper, we provide the additional capability of error recovery to processing arrays that utilize spare nodes to achieve fault coverage. Existing coverage schemes [7] call for spares to idle until required for coverage. Valuable real estate is taken up by idling circuitry, and further, power is sacrificed. These unresolved, practical issues, plus the cost of extra hardware, have hindered spares-oriented schemes from widespread use in the field. The reliable operation of large single-package array systems, necessary to achieve high performance levels and size reduction, depends on fault tolerance. Redundant …
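The claim that restarting from the beginning inflates job completion time under frequent transient errors can be illustrated with a toy model. The sketch below is not the paper's scheme: the function name `completion_time`, the per-step error probability, immediate error detection, and zero-cost checkpoints are all assumptions made for illustration. It counts how many steps a job executes when an error either rolls the job back to the last recovery point or, with no recovery points, back to the start.

```python
import random


def completion_time(job_steps, error_prob, checkpoint_interval, rng):
    """Count executed steps until job_steps of useful work are complete.

    Hypothetical model: each step suffers a transient error independently
    with probability error_prob, and the error is detected immediately.
    On an error, progress rolls back to the last recovery point.
    checkpoint_interval=None means no recovery points (restart at step 0).
    Saving a recovery point is assumed to cost nothing.
    """
    done = 0             # useful steps completed so far
    elapsed = 0          # total steps executed, including wasted work
    last_checkpoint = 0  # progress saved at the most recent recovery point
    while done < job_steps:
        elapsed += 1
        if rng.random() < error_prob:
            done = last_checkpoint  # roll back and redo the lost work
            continue
        done += 1
        if checkpoint_interval and done % checkpoint_interval == 0:
            last_checkpoint = done
    return elapsed


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so runs are repeatable
    trials = 50
    restart = sum(completion_time(100, 0.05, None, rng) for _ in range(trials))
    rollback = sum(completion_time(100, 0.05, 10, rng) for _ in range(trials))
    print("mean steps, restart from beginning:", restart / trials)
    print("mean steps, rollback to recovery point:", rollback / trials)
```

In this model the expected cost of restart-only recovery grows roughly exponentially with job length, while rollback keeps the wasted work bounded by the spacing of recovery points. A real evaluation would also have to charge for creating and validating recovery points, which this sketch ignores.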
Similar resources
Optimal Routing and Channel Assignments for Hypercube Communication on Optical Mesh–like Processor Arrays
This paper considers optimal routing and channel assignment (RCA) schemes to realize hypercube communication on optical mesh–like networks. Specifically, we identify lower bounds on the number of channels required to realize hypercube communication on top of array and ring topologies and develop optimal RCA schemes that achieve the lower bounds on these two topologies. We further extend the sch...
New adaptive interpolation schemes for efficient mesh-based motion estimation
Motion estimation and compensation is an essential part of existing video coding systems. The mesh-based motion estimation (MME) produces smoother motion field, better subjective quality (free from blocking artifacts), and higher peak signal-to-noise ratio (PSNR) in many cases, especially at low bitrate video communications, compared to the conventional block matching algorithm (BMA). Howev...
Comparison of three different numerical schemes for 2D steady incompressible lid-driven cavity flow
In this study, a numerical solution of 2D steady incompressible lid-driven cavity flow is presented. Three different numerical schemes were employed to make a comparison on the practicality of the methods. An alternating direction implicit scheme for the vorticity-stream function formulation, explicit and implicit schemes for the primitive variable formulation of governing Navier-Stokes equatio...
Mesh Routing Topologies for Multi-FPGA Systems
There is currently great interest in using fixed arrays of FPGAs for logic emulators, custom computing devices, and software accelerators. An important part of designing such a system is determining the proper routing topology to use to interconnect the FPGAs. This topology can have a great effect on the area and delay of the resulting system. Tree, Bipartite Graph, and Mesh interconnection sch...
Study of Various Schemes for Link Recovery in Wireless Mesh Network
As there is a growing need for cost-effective and highly dynamic large-bandwidth networks over large coverage areas, the Wireless Mesh Network provides a first step towards effective communication. A Wireless Mesh Network is one of the most advanced wireless networks used for communication. During their operating period, wireless mesh networks may suffer from frequent link failure which res...